Stochastic Feature Transformation with Divergence-Based Out-of-Handset Rejection for Robust Speaker Verification

Authors

  • Man-Wai Mak
  • Chi-Leung Tsang
  • Sun-Yuan Kung
Abstract

The performance of telephone-based speaker verification systems can be severely degraded by linear and non-linear acoustic distortion caused by telephone handsets. This paper proposes to combine a handset selector with stochastic feature transformation to reduce the distortion. Specifically, a GMM-based handset selector is trained to identify the most likely handset used by the claimants, and then handset-specific stochastic feature transformations are applied to the distorted feature vectors. This paper also proposes a divergence-based handset selector with out-of-handset (OOH) rejection capability to identify the ‘unseen’ handsets. This is achieved by measuring the Jensen difference between the selector’s output and a constant vector with identical elements. The resulting handset selector is combined with the proposed feature transformation technique for telephone-based speaker verification. Experimental results based on 150 speakers of the HTIMIT corpus show that the handset selector, either with or without OOH rejection capability, is able to identify the ‘seen’ handsets accurately (98.3% in both cases). Results also demonstrate that feature transformation performs significantly better than the classical cepstral mean normalization approach. Finally, by using the transformation parameters of the ‘seen’ handsets to transform the utterances with correctly identified handsets and processing those utterances with ‘unseen’ handsets by cepstral mean subtraction, verification error rates are reduced significantly (from 12.41% to 6.59% on average).
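The rejection rule described in the abstract can be illustrated with a minimal Python sketch. It assumes the selector's output is a vector of handset posterior probabilities, uses the standard entropy-based definition of the Jensen difference, and treats an output that is too close to the uniform vector as coming from an 'unseen' handset. The function names, the threshold value, and the per-dimension affine form of the transformation are illustrative assumptions; the abstract does not specify them.

```python
import math


def entropy(p):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def jensen_difference(p, q):
    """Jensen difference: H((p + q) / 2) - (H(p) + H(q)) / 2."""
    mid = [(a + b) / 2.0 for a, b in zip(p, q)]
    return entropy(mid) - 0.5 * (entropy(p) + entropy(q))


def select_handset(posteriors, threshold=0.1):
    """Return the index of the most likely handset, or None for OOH.

    The threshold is a hypothetical value; in practice it would be tuned.
    """
    n = len(posteriors)
    uniform = [1.0 / n] * n  # constant vector with identical elements
    if jensen_difference(posteriors, uniform) < threshold:
        return None  # output is nearly uniform: treat handset as unseen
    return max(range(n), key=lambda i: posteriors[i])


def cepstral_mean_subtraction(frames):
    """Subtract the per-dimension utterance mean from every frame."""
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / len(frames) for d in range(dim)]
    return [[f[d] - means[d] for d in range(dim)] for f in frames]


def compensate(frames, posteriors, transforms, threshold=0.1):
    """Route an utterance to a handset-specific transform or to CMS.

    transforms[k] is a (scale, offset) pair per handset; this affine
    form is an assumption made only for illustration.
    """
    handset = select_handset(posteriors, threshold)
    if handset is None:
        return cepstral_mean_subtraction(frames)
    scale, offset = transforms[handset]
    return [[scale[d] * f[d] + offset[d] for d in range(len(f))]
            for f in frames]
```

A peaked output such as `[0.97, 0.01, 0.01, 0.01]` lies far from the uniform vector, so the utterance is assigned to handset 0 and transformed with that handset's parameters. A flat output such as `[0.25, 0.25, 0.25, 0.25]` has a Jensen difference of zero and falls back to cepstral mean subtraction.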

Similar resources

Cluster-Dependent Feature Transformation with Divergence-based Out-of-Handset Rejection for Robust Speaker Verification

This paper proposes a divergence-based cluster selector with out-of-handset (OOH) rejection capability to identify the ‘unseen’ handsets. This is achieved by measuring the Jensen difference between the selector’s output and a constant vector with identical elements. The resulting cluster selector is combined with a feature-based channel compensation algorithm for telephone-based speaker verific...

Divergence-based out-of-class rejection for telephone handset identification

Research has shown that handset selectors can be used to assist telephone-based speech/speaker recognition. Most handset selectors, however, simply select the most likely handset from a set of known handsets even for speech coming from an ‘unseen’ handset. This paper proposes a divergence-based handset selector with out-of-handset (OOH) rejection capability to identify the ‘unseen’ handsets. Th...

Divergence-based Out-of-class Reject Identificatio

Research has shown that handset selectors can be used to assist telephone-based speech/speaker recognition. Most handset selectors, however, simply select the most likely handset from a set of known handsets even for speech coming from an ‘unseen’ handset. This paper proposes a divergence-based handset selector with out-of-handset (OOH) rejection capability to identify the ‘unseen’ handsets. Th...

Speaker Verification from Coded Telephone Speech Using Stochastic Feature Transformation and Handset Identification (Sun-Yuan Kung)

A handset compensation technique for speaker verification from coded telephone speech is proposed. The proposed technique combines handset selectors with stochastic feature transformation to reduce the acoustic mismatch between different handsets and different speech coders. Coder-dependent GMM-based handset selectors are trained to identify the most likely handset used by the claimants. Stocha...

Environment adaptation for robust speaker verification by cascading maximum likelihood linear regression and reinforced learning

In speaker verification over public telephone networks, utterances can be obtained from different types of handsets. Different handsets may introduce different degrees of distortion to the speech signals. This paper attempts to combine a handset selector with (1) handset-specific transformations, (2) reinforced learning, and (3) stochastic feature transformation to reduce the effect caused by t...

Journal:
  • EURASIP J. Adv. Sig. Proc.

Volume: 2004

Publication date: 2004